Investigating confounding in network‐based test‐negative design influenza vaccine effectiveness studies—Experience from the DRIVE project

Abstract Background: Establishing a large study network to conduct influenza vaccine effectiveness (IVE) studies while collecting appropriate variables to account for potential bias is important; the most relevant variables should be prioritized. We explored the impact of potential confounders on IVE in the DRIVE multi‐country network of sites conducting test‐negative design (TND) studies. Methods: We constructed a directed acyclic graph (DAG) to map the relationship between influenza vaccination, medically attended influenza infection, confounders, and other variables. Additionally, we used the Development of Robust and Innovative Vaccines Effectiveness (DRIVE) data from the 2018/2019 and 2019/2020 seasons to explore the effect of covariate adjustment on IVE estimates. The reference model was adjusted for age, sex, calendar time, and season. The covariates studied were presence of at least one, two, or three chronic diseases; presence of six specific chronic diseases; and prior healthcare use. Analyses were conducted by site and subsequently pooled. Results: The following variables were included in the DAG: age, sex, time within influenza season and year, health status and comorbidities, study site, health‐care‐seeking behavior, contact patterns and social precautionary behavior, socioeconomic status, and pre‐existing immunity. Across all age groups and settings, only adjustment for lung disease in older adults in the primary care setting resulted in a relative change of the IVE point estimate >10%. Conclusion: Our study supports a parsimonious approach to confounder adjustment in TND studies, limited to adjusting for age, sex, and calendar time. Practical implications are that necessitating fewer variables lowers the threshold for enrollment of sites in IVE studies and simplifies the pooling of data from different IVE studies or study networks.

Observational studies of seasonal IVE are susceptible to bias and confounding, and these factors need to be considered at the study design and analysis stages. Differences in disease risk or in care-seeking behaviors between vaccinated and unvaccinated subjects and the difference in the probability of being vaccinated can substantially bias IVE estimates. Even when selection bias is reduced, the true IVE may be overestimated or underestimated whenever confounding is present. Several strategies are available to prevent or at least reduce bias and confounding by known factors, such as restriction of the study population (e.g., to persons seeking care for respiratory symptoms in test-negative design (TND) case-control studies or to persons for whom a nasal swab was collected within a predefined number of days of symptom onset), stratification of estimates (e.g., by age group or population subgroups), matching, and statistical adjustment through multivariate regression. 2 Nevertheless, in order to make a network successful and sustainable on a large scale, tradeoffs should be made in the collection of the most relevant variables. Throughout the 5 years of the DRIVE project, field-based experts from both public and private sectors have discussed which confounders should be integrated into the generic protocol and collected by sites to ensure robust IVE estimates. EMA scientific advice was also sought by the DRIVE partners on the required number of confounders and the relevance of a parsimonious analysis. For IVE analyses in the 2017/2018 influenza season, the DRIVE took advantage of data collected through existing infrastructures and selected confounders through model building. 3
In 2018/2019, the first season for which a common protocol was developed and used, IVE estimates were adjusted for a fixed, elaborate set of confounders, namely age, sex, calendar time, presence of at least one chronic condition, pregnancy, number of hospitalizations or general practitioner (GP) visits in the previous year, and vaccination status in the previous season. 4 However, some sites were not able to collect all variables, either at all or for all subjects. This led to inconsistent confounder adjustment across sites and exclusion of subjects with missing values. To harmonize confounder adjustment and minimize data loss, the number of covariates adjusted for was decreased as of the 2019/2020 season, retaining only age, sex, and calendar time. 5

| DAG
We constructed a directed acyclic graph (DAG) to visually represent the relationship between influenza vaccination and medically attended laboratory-confirmed influenza infection presenting as influenza-like illness (ILI) or severe acute respiratory infection (SARI).
DAGs are visual tools used to identify confounding variables and common (including unmeasured/unmeasurable) causes of the exposure and outcome and to explicitly state the assumptions made regarding relationships between variables. 8 The full causal diagram, describing all underlying relationships among all possible variables, is often not known. However, in 2011, VanderWeele et al used the so-called "disjunctive cause criterion" to demonstrate that controlling for all observed variables that affect the exposure, the outcome, or both is sufficient to control for confounding. 9 We built upon the DAG that Lane et al developed by taking covariates used in >10% of published TND studies identified in a systematic review. 6 Potential sources of confounding were identified from a systematic review on bias and confounding, 10 a systematic review on determinants of influenza vaccination uptake in older adults, 11 and expert input, and were included regardless of the operational feasibility of data collection. Using DAGitty, a browser-based environment for creating, editing, and analyzing causal diagrams, 12 we tested the minimal sufficient adjustment set for estimating the total effect of current influenza vaccination on medically attended influenza infection (i.e., for DAG closure).
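The idea of a minimal sufficient adjustment set can be illustrated with a small, self-contained sketch. This is not DAGitty and not the DRIVE DAG from Figure 1: the edge list below is a hypothetical toy graph, the node names are chosen for illustration only, and the check uses the standard backdoor criterion (d-separation tested on the moralized ancestral graph) as one possible way to test sufficiency.

```python
from itertools import combinations

def parents_of(edges):
    """Map every node to the set of its parents."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
        parents.setdefault(parent, set())
    return parents

def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen

def d_separated(edges, x, y, z):
    """True if z d-separates x and y (moralized ancestral graph method)."""
    parents = parents_of(edges)
    keep = ancestral_set(parents, {x, y} | set(z))
    adj = {node: set() for node in keep}
    for child in keep:
        ps = parents.get(child, set())
        for p in ps:                        # undirected parent-child links
            adj[p].add(child)
            adj[child].add(p)
        for a, b in combinations(ps, 2):    # "marry" parents sharing a child
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = set(), [x]
    while stack:                            # search for a path avoiding z
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in z:
            continue
        seen.add(node)
        stack.extend(adj[node])
    return True

def descendants(edges, node):
    """All nodes reachable from node along directed edges."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [node]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def satisfies_backdoor(edges, exposure, outcome, z):
    """Backdoor criterion: no descendants of the exposure in z, and z blocks
    every path once the edges leaving the exposure are removed."""
    if set(z) & descendants(edges, exposure):
        return False
    backdoor_graph = [(p, c) for p, c in edges if p != exposure]
    return d_separated(backdoor_graph, exposure, outcome, set(z))

def minimal_adjustment_sets(edges, exposure, outcome, candidates):
    """Enumerate sufficient sets that contain no smaller sufficient set."""
    found = []
    for size in range(len(candidates) + 1):
        for z in combinations(candidates, size):
            if any(set(f) <= set(z) for f in found):
                continue
            if satisfies_backdoor(edges, exposure, outcome, z):
                found.append(z)
    return found

# Hypothetical toy DAG (assumed edges, NOT the DAG of Figure 1).
TOY_EDGES = [
    ("age", "vaccination"), ("age", "influenza"), ("age", "health_status"),
    ("sex", "vaccination"), ("sex", "influenza"),
    ("calendar_time", "vaccination"), ("calendar_time", "influenza"),
    ("health_status", "vaccination"), ("health_status", "influenza"),
    ("vaccination", "influenza"),
]
CANDIDATES = ["age", "calendar_time", "health_status", "sex"]
minimal_sets = minimal_adjustment_sets(TOY_EDGES, "vaccination", "influenza", CANDIDATES)
print(minimal_sets)
```

In this toy graph every candidate is a direct common cause of exposure and outcome, so all four are required. A graphical check of this kind only indicates which paths are open; it says nothing about how large the resulting bias is, which is the question the empirical analysis addresses.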

| Dataset
This exploratory analysis made secondary use of the DRIVE datasets, based on TND studies conducted at four GP sites and four hospital sites in five countries in the 2018/2019 influenza season and at four GP sites and eight hospital sites in seven countries in the 2019/2020 influenza season (Table 1). Subject characteristics are presented in supporting information 1. While the DRIVE covered the influenza seasons from 2017 to 2022, data from the other DRIVE seasons were not used because common protocols had not yet been implemented in 2017/2018, the number of influenza cases in Europe was historically low as a result of the COVID-19 pandemic in 2020/2021, and data collection for the 2021/2022 season was still ongoing at the time of the analysis.
Data collection and site characteristics have been described in more depth elsewhere. 4,5 In brief, patients with ILI were enrolled in the primary care setting and patients with SARI in the hospital setting (see Table 1 for definitions). Only community-dwelling ILI and SARI patients presenting during the study period for analysis, ≤7 days after symptom onset and without contraindication for influenza vaccination, were included in the dataset. The outcome of interest was laboratory-confirmed influenza (primarily through reverse transcription polymerase chain reaction [RT-PCR]). The exposure of interest was vaccination with any seasonal influenza vaccine (>14 days prior to symptom onset) in the respective season.
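For readers less familiar with the TND, the following minimal sketch shows how a crude (unadjusted) IVE follows from the exposure and outcome defined above, using the standard relation IVE = (1 − OR) × 100. The counts are invented for illustration, the function name `tnd_ive` is hypothetical, and the Woolf (log-based) confidence interval is an assumption of this sketch; the DRIVE analyses used adjusted regression models rather than a 2×2 table.

```python
import math

def tnd_ive(vacc_cases, unvacc_cases, vacc_controls, unvacc_controls, z=1.96):
    """Crude IVE (%) and Woolf-type CI from a test-negative 2x2 table.

    Cases are test-positive patients, controls are test-negative patients.
    Returns (ive, ive_lower, ive_upper), all in percent.
    """
    # Odds of vaccination among cases divided by odds among controls.
    odds_ratio = (vacc_cases * unvacc_controls) / (unvacc_cases * vacc_controls)
    se_log_or = math.sqrt(
        1 / vacc_cases + 1 / unvacc_cases + 1 / vacc_controls + 1 / unvacc_controls
    )
    or_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    or_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    # A higher odds ratio means a lower IVE, so the CI bounds swap.
    return (1 - odds_ratio) * 100, (1 - or_high) * 100, (1 - or_low) * 100

# Invented counts: 30 vaccinated and 70 unvaccinated cases,
# 50 vaccinated and 50 unvaccinated controls.
ive, ive_low, ive_high = tnd_ive(30, 70, 50, 50)
print(f"Crude IVE = {ive:.1f}% (95% CI {ive_low:.1f} to {ive_high:.1f})")
```

With these invented counts the odds ratio is 30 × 50 / (70 × 50) ≈ 0.43, giving a crude IVE of about 57%.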

| Covariates considered
Covariates selected for this study had to be present in the DAG, be col- [...] prevalence of at least 10% in one of the age groups. This was done as variables with a low prevalence are unlikely to lead to a large change in the coefficients. The following covariates fulfilled these criteria and

Table 1 footnotes: a ILI was defined as an individual presenting with sudden onset of symptoms; AND at least one of the following systemic signs or symptoms: fever/feverishness, malaise, headache, and myalgia; AND at least one of the following respiratory symptoms: cough, sore throat, and shortness of breath. b SARI was defined as a hospitalized person with a suspicion of infection with at least one of the systemic signs or symptoms defined above or deterioration of general condition; AND at least one of the respiratory symptoms defined above, at admission or within 48 h after admission.

| Predictors of vaccination and the outcome
Seasonal influenza vaccines are preferentially assigned to specific population groups according to their specific indication (different vaccines are recommended depending on age and/or on the presence of medical conditions) and according to national or regional vaccine recommendations, 15

| RESULTS
A DAG was constructed ( Figure 1). We were in agreement with the rationale for including the covariates age, sex, calendar time within to the study population (too long interval between onset and specimen collection, presenting outside influenza risk period, and symptom onset <15 days after vaccination). 6 We were also in agreement with these criteria; and they were part of the DRIVE study's inclusion/ exclusion criteria. In our DAG, we combined health status, nonimmunocompromising comorbidities, and immunocompromising comorbidities into one confounder, as we considered comorbidities to be part of "health status," and we considered both types of comorbid-

| Relative change in IVE estimate
Absolute and relative changes in IVE of comparator models versus the reference models are shown in Table 3 (in the primary care setting) and

is based on many factors" and is therefore "unlikely to be completely captured by a single binary indicator of whether or not a person presents himself/herself to a physician when experiencing influenza symptoms, so healthcare-seeking would remain partially unobserved and the TND design is unlikely to completely block the effects of this confounder" [46]. Healthcare-seeking behavior, in general, may also be associated with increased opportunities to be offered influenza vaccine. Healthcare-seeking behavior bias is likely more pronounced for mild disease than for severe disease. Healthcare-seeking behavior is not straightforward to operationalize, but proxies could include sex (with females being generally more prone to seek care than males 25 ), the number of recent GP visits, or up-to-date pneumococcal vaccination (for adult age groups). 19

Pre-existing immunity from infection or vaccination
Depending on the circulating influenza strains and the degree and duration of residual immunity, persons having recently experienced an influenza infection may be (partially) protected from influenza [47,48]. At the same time, a recent prior influenza infection has been reported by GPs as a factor that increases influenza vaccine acceptance [49]. Confounding because of immunizing infections may be expected to vary across seasons, as population-level intensity, severity of recent influenza seasons and changes in influenza vaccine composition could impact the perceived necessity of vaccination. Prior influenza vaccination may be a confounder of IVE when influenza vaccination in the current season is associated with vaccination history and when vaccination modifies the risk of natural infection because of lower previous risk of infection or persisting immunity [50].
However, prior influenza vaccination is highly predictive of influenza vaccination in the current season 29 ; this collinearity may lead to overadjustment if this variable is included in statistical models.

Social contact patterns and precautionary social behavior
Social contact patterns affect the risk of exposure to influenza virus. Social contact patterns may be related to occupation; healthcare workers with direct patient contact may be more likely to have occupational exposure to influenza, and this group is typically targeted for influenza vaccination [51]. Contact patterns have been highly associated with age and household size, whereas the average number of contacts varies between countries [52-54]. Persons working with young children may be more willing to accept vaccination if they have an additional risk factor (e.g., a medical condition). Among older adults, social inclusion into family or informal social networks (which may increase their number of contacts) was found to positively affect vaccine uptake. 11 In a study conducted among older adults in three European countries, exposure to children under the age of five living outside of the household explained 10% of all acute respiratory tract infections [55]. Precautionary social behavior affects the risk of exposure to influenza virus and may impact motivation to be vaccinated. Although precautionary behavior is always relevant in the prevention of influenza, preventive measures such as face mask wearing, physical distancing, and handwashing have become widespread since 2020 with the COVID-19 pandemic. These measures against SARS-CoV-2 virus transmission also impact the circulation of other respiratory viruses such as influenza, as illustrated by the strong reductions in influenza circulation in Europe in the 2020/2021 Northern Hemisphere winter [56]. In addition, precautionary behavior such as mask wearing and distancing likely leads to a smaller dose of the initial inoculum if exposed despite the measures taken, thereby reducing the chance of developing severe disease [57]. The relevance of precautionary social behavior in IVE studies will likely depend on future COVID-19 containment measures.
Socioeconomic status and ethnicity
Higher socioeconomic status or educational level may support increased vaccine uptake (in older adults), 11 and uptake has been found to be lower in certain ethnic groups (migration background, religion) [58-61]. At the same time, it may impact healthcare-seeking behavior (including accessibility of healthcare) and other social aspects such as contact patterns and health beliefs leading to precautionary behavior, and health status.

frequently not medically attended or based on clinical diagnosis only.
Although vaccination status in prior seasons is likely to be retrievable, it is highly collinear with influenza vaccination in the present season and its inclusion in the statistical model may therefore lead to overadjustment. 29 The DAG describes the relationship between current influenza vaccination and medically attended laboratory-confirmed influenza infection presenting as ILI/SARI, that is, test-positive cases.
Test-negative controls, who present with non-influenza respiratory infections, are not described in the DAG. However, IVE estimates can be biased by factors that modify non-ILI ARI risk and many may be associated with influenza vaccination. 19

The present study has several limitations. This was a secondary

particularly in older adults. 36 Frailty is a dynamic and multifactorial syndrome in older adults that represents a reduction in physiological reserve, limited ability to resist environmental stressors, and increased risk of functional decline. Frailty is a state of increased vulnerability to adverse outcomes compared to others of the same age. 37 Applied studies highlighted the substantial confounding effect of frailty in the context of IVE studies and underscored the importance of using this multidimensional component instead of isolated factors to account for the health status and vulnerability of study participants. 38

The statistical model used was a conditional model (also used in the DRIVE IVE studies 4,5 ), which models the effect of potential confounders on the outcome. A drawback of this model is that the noncollapsibility of the odds ratio and incidence ratio implies that changes in the IVE because of the exclusion or inclusion of a covariate might not be caused by the confounding effect. 8 This is more problematic for common outcomes (such as ILI) than for relatively rare outcomes (such as SARI). 39

There is a risk of bias associated with unadjusted IVE estimates. 42 In our study, all estimates were minimally adjusted for age, sex, calendar time, and season. The finding that adjustment for additional variables did not lead to a meaningful impact on the IVE (with one exception) can be because the evaluated covariates are not true confounders, because of the low magnitude of the effect of true confounders, and/or because of the low prevalence of the variable in the population.
Omitting variables with low prevalence only has a relatively small impact on IVE compared to more prevalent conditions with the same effect on the exposure and outcome. 43 Although most estimates of the relative change in IVE were close to 1.0, indicating little to no impact of additional adjustment on the IVE estimates in the current dataset, variability was observed in the width of the CI. Multiple covariates analyzed in adults and older adults in the primary care setting and in children in the hospital setting had a CI that encompassed a >10% relative change in the IVE, which is an important caveat when generalizing the findings to potential future IVE studies. In addition, for older adults in the primary care setting, the only stratification in which a covariate led to a >10% relative change in the IVE, the number of subjects was substantially lower than in the other age and setting stratifications.
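The noncollapsibility caveat raised above can be made concrete with a small numerical sketch. It assumes a hypothetical population with a binary covariate that affects the outcome risk but is independent of vaccination (so it is not a confounder), and a vaccine with the same conditional odds ratio in both strata. All risks and weights are invented, and expressing the relative change as the ratio of adjusted to unadjusted IVE is an assumption of this sketch, chosen because values close to 1.0 are described as indicating little change.

```python
def risk_after_or(baseline_risk, odds_ratio):
    """Risk obtained by multiplying the baseline odds by an odds ratio."""
    odds = baseline_risk / (1 - baseline_risk) * odds_ratio
    return odds / (1 + odds)

def marginal_or(baseline_risks, weights, conditional_or):
    """Odds ratio after collapsing over strata of a covariate that is
    independent of vaccination (same stratum weights in both groups)."""
    p_unvacc = sum(w * p for w, p in zip(weights, baseline_risks))
    p_vacc = sum(
        w * risk_after_or(p, conditional_or) for w, p in zip(weights, baseline_risks)
    )
    return (p_vacc / (1 - p_vacc)) / (p_unvacc / (1 - p_unvacc))

def ive(odds_ratio):
    """IVE in percent from an odds ratio."""
    return (1 - odds_ratio) * 100

CONDITIONAL_OR = 0.5   # same within each stratum, i.e. adjusted IVE of 50%
WEIGHTS = [0.5, 0.5]   # covariate prevalence of 50% (invented)

# Invented outcome risks among the unvaccinated, by covariate stratum.
common = marginal_or([0.10, 0.60], WEIGHTS, CONDITIONAL_OR)
rare = marginal_or([0.001, 0.006], WEIGHTS, CONDITIONAL_OR)

for label, unadjusted_or in (("common outcome", common), ("rare outcome", rare)):
    ratio = ive(CONDITIONAL_OR) / ive(unadjusted_or)
    print(f"{label}: unadjusted OR = {unadjusted_or:.3f}, "
          f"adjusted/unadjusted IVE ratio = {ratio:.3f}")
```

In the common-outcome scenario the unadjusted odds ratio moves toward 1 (about 0.59 instead of 0.50), so the IVE changes by more than 10% after adjustment although no confounding is present. In the rare-outcome scenario the two odds ratios nearly coincide.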
A strength of the study is that the analysis is based on two seasons of data from a network of multiple European countries using harmonized data collection (through a core protocol and codebook).
Furthermore, we started by building a theoretical framework to justify the selection of covariates in the confounder analysis and provided a

ACKNOWLEDGEMENTS
The authors would like to acknowledge Kaatje Bollaerts for the initial work done on confounding using the DRIVE data and Alejandra Gonzalez Diaz for editorial support. In addition, they would like to thank all patients and staff involved in the studies at the DRIVE study sites in 2018/2019 and 2019/2020. McGovern is an employee of Seqirus and holds shares in Seqirus.

CONFLICTS OF INTEREST
Ainara Mira-Iglesias has no potential conflict of interest. Jos Nauta is employed by Abbott, a company that produces influenza vaccines, and holds shares in Abbott. Laurence Torcel-Pagnon is an employee of Sanofi and holds shares in Sanofi. Jorne Biccler is an employee of P95.

ETHICS STATEMENT
This study is a reanalysis of previously collected data. In the original studies, each local study was approved by national, regional, or institutional ethics committees, as appropriate. In the case of ISS, the study was submitted to the ethics committee for information, but approval was not required as the study is nested in the National

CONSENT TO PARTICIPATE AND FOR PUBLICATION
This study is a reanalysis of previously collected data. In the original studies, all participants provided informed consent, when required by the ethics committee.